Supporting Fault-Tolerant Parallel Programming in Linda

Authors

  • David E. Bakken
  • Richard D. Schlichting

Abstract

Linda is a language for programming parallel applications whose most notable feature is a distributed shared memory called tuple space. While suitable for a wide variety of programs, one shortcoming of the language as commonly defined and implemented is a lack of support for writing programs that can tolerate failures in the underlying computing platform. This paper describes FT-Linda, a version of Linda that addresses this problem by providing two major enhancements that facilitate the writing of fault-tolerant applications: stable tuple spaces and atomic execution of tuple space operations. The former is a type of stable storage in which tuple values are guaranteed to persist across failures, while the latter allows collections of tuple operations to be executed in an all-or-nothing fashion despite failures and concurrency. The design of these enhancements is presented in detail and illustrated by examples drawn from both the Linda and fault-tolerance domains. An implementation of FT-Linda for a network of workstations is also described. The design is based on replicating the contents of stable tuple spaces to provide failure resilience and then updating the copies using atomic multicast. This strategy allows an efficient implementation in which only a single multicast message is needed for each atomic collection of tuple space operations.
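
The two enhancements described above can be illustrated with a small single-process sketch. It is not FT-Linda syntax or the authors' implementation: the names `TupleSpace`, `ReplicatedSpace`, and `apply_atomically`, the use of `None` as a wildcard field, and the op encoding are all assumptions made for illustration. A plain loop over in-memory replicas stands in for atomic multicast, and nothing here survives a real crash. What it does show is the all-or-nothing behaviour of a batch of tuple operations, and that one "message" per batch is enough to keep identical replicas in step.

```python
from typing import Any, List, Optional, Tuple

Pattern = Tuple[Any, ...]   # assumption: None acts as a wildcard field
Op = Tuple[str, Pattern]    # ("out", tuple) or ("inp", pattern)


def matches(pattern: Pattern, tup: Pattern) -> bool:
    """True if tup has the same arity and agrees on every non-wildcard field."""
    return len(pattern) == len(tup) and all(
        p is None or p == t for p, t in zip(pattern, tup)
    )


class TupleSpace:
    """One in-memory replica of a tuple space (a bag of tuples)."""

    def __init__(self) -> None:
        self.tuples: List[Pattern] = []

    def out(self, tup: Pattern) -> None:
        """Deposit a tuple."""
        self.tuples.append(tup)

    def inp(self, pattern: Pattern) -> Optional[Pattern]:
        """Non-blocking withdraw: remove and return a match, or None."""
        for i, tup in enumerate(self.tuples):
            if matches(pattern, tup):
                return self.tuples.pop(i)
        return None

    def apply_atomically(self, ops: List[Op]) -> bool:
        """Apply every op or none: work on a copy, commit only on success."""
        scratch = TupleSpace()
        scratch.tuples = list(self.tuples)
        for kind, arg in ops:
            if kind == "out":
                scratch.out(arg)
            elif kind == "inp":
                if scratch.inp(arg) is None:
                    return False  # abort; self.tuples is untouched
            else:
                raise ValueError(f"unknown op {kind!r}")
        self.tuples = scratch.tuples
        return True


class ReplicatedSpace:
    """Delivers each batch of ops to every replica as one 'message'."""

    def __init__(self, n_replicas: int) -> None:
        self.replicas = [TupleSpace() for _ in range(n_replicas)]
        self.messages_sent = 0

    def atomic(self, ops: List[Op]) -> bool:
        self.messages_sent += 1  # stands in for a single atomic multicast
        # Replicas start identical and apply batches in the same order,
        # so they all reach the same decision.
        results = [r.apply_atomically(ops) for r in self.replicas]
        return all(results)


space = ReplicatedSpace(3)
space.atomic([("out", ("task", 1))])
# Claim the task and record that it is in progress, in one atomic step.
claimed = space.atomic([("inp", ("task", None)), ("out", ("in_progress", 1))])
# The second op cannot succeed, so the first op's tuple is never deposited.
rejected = space.atomic([("out", ("x",)), ("inp", ("missing",))])
```

In this sketch the claim-and-mark step is what a worker in a bag-of-tasks program would need: if withdrawing the task and depositing the in-progress marker were separate steps, a failure between them could lose the task.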

Similar resources

Fault-tolerant Distributed Applications In LiPS

Performing computations using networks of workstations is increasingly becoming an alternative to using a supercomputer. This approach is motivated by the vast quantities of unused idle-time available in workstation networks. Unlike computing on a tightly coupled parallel computer, where a fixed number of processor nodes is used within a computation, the number of usable nodes in a workstati...

Fault Tolerance Lessons Applied to Parallel Computing

This paper describes an approach to fault-tolerant parallel computing which is based on the experiences with the most successful fault-tolerant software – the transaction processing systems. The algorithms presented here have less runtime overhead and faster recovery than most preceding approaches. In the Pact parallel programming environment fault tolerance is provided fully user transparent i...

Recovery with limited replay: fault-tolerant processes in Linda

Research in the area of fault-tolerant distributed systems has focused to a large extent on data surviving various forms of failure. The replica control algorithms for maintaining mutually consistent replicas abound in number. However, comparatively little work has been devoted to making processes recoverable. In domains other than databases and transaction processing, fault tolerance generally ...

Adding Fault-tolerant Transaction Processing to LINDA

To simplify the difficult task of writing fault-tolerant parallel software, we implemented extensions to the basic functionality of the LINDA or tuple-space programming model. Our approach implements a mechanism of transaction processing to ensure that tuples are properly handled in the event of a node or communications failure. If a process retrieving a tuple fails to complete processing or a ...

Journal title:
  • IEEE Trans. Parallel Distrib. Syst.

Volume 6  Issue 

Pages  -

Publication date 1995